Using EASI Sentinel-1 RTC Gamma0 data¶

This notebook will demonstrate how to load and use Sentinel-1 RTC Gamma0 data generated in EASI.

The processing uses SNAP-10 with a Graph Processing Tool (GPT) XML recipe for RTC Gamma0 and its variants.

For most uses we recommend the smoothed 20 m product (sentinel1_grd_gamma0_20m). We can process the 10 m products (sentinel1_grd_gamma0_10m, sentinel1_grd_gamma0_10m_unsmooth) on request. Please also ask if you wish to trial other combinations of the parameters.

RTC Gamma0 product variants¶

| | sentinel1_grd_gamma0_20m | sentinel1_grd_gamma0_10m | sentinel1_grd_gamma0_10m_unsmooth |
|---|---|---|---|
| **DEM** | | | |
| copernicus_dem_30 | Y | Y | Y |
| Scene to DEM extent multiplier | 3.0 | 3.0 | 3.0 |
| **SNAP operator** | | | |
| Apply-Orbit-File | Y | Y | Y |
| ThermalNoiseRemoval | Y | Y | Y |
| Remove-GRD-Border-Noise | Y | Y | Y |
| Calibration | Y | Y | Y |
| SetNoDataValue | Y | Y | Y |
| Terrain-Flattening | Y | Y | Y |
| Speckle-Filter | Y | Y | N |
| Multilook | Y | Y | N |
| Terrain-Correction | Y | Y | Y |
| **Output** | | | |
| Projection | WGS84, epsg:4326 | WGS84, epsg:4326 | WGS84, epsg:4326 |
| Pixel resolution | 20 m | 10 m | 10 m |
| Pixel alignment (Area = top-left) | Area | Area | Area |

Units and conversions¶

  • intensity = amplitude * amplitude
  • amplitude = sqrt(intensity)
  • dB = 10*log10(intensity)
  • intensity = 10**(dB/10)

Two example functions for scalar values are shown here; later in the notebook we define equivalent functions for xarray datasets/arrays.

import math
def intensity_to_db(x):
    return 10*math.log10(x)
def db_to_intensity(db):
    return math.pow(10, db/10.0)

Reference: https://forum.step.esa.int/t/what-stage-of-processing-requires-the-linear-to-from-db-command
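The same conversions apply element-wise to whole arrays. A minimal sketch using numpy ufuncs (an assumption on our part: `np.log10` and `np.power` are ufuncs, so they also operate directly on xarray DataArrays, including lazily on dask-backed ones):

```python
import numpy as np

def intensity_to_db(x):
    """Convert backscatter intensity to decibels.
    Works element-wise on scalars, numpy arrays and xarray DataArrays alike."""
    return 10.0 * np.log10(x)

def db_to_intensity(db):
    """Inverse conversion: decibels back to linear intensity."""
    return np.power(10.0, db / 10.0)

intensity_to_db(0.001)   # -30.0
db_to_intensity(-3.0)    # ≈ 0.501
```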

Set up¶

Import required packages and functions¶

In [1]:
# Basic plots
%matplotlib inline
# import matplotlib.pyplot as plt
# plt.rcParams['figure.figsize'] = [12, 8]

# Common imports and settings
import os, sys
from pathlib import Path
from IPython.display import Markdown
import pandas as pd
pd.set_option("display.max_rows", None)
import xarray as xr

# Datacube
import datacube
from datacube.utils.rio import configure_s3_access
from datacube.utils import masking
from datacube.utils.cog import write_cog
# https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Tools
from dea_tools.plotting import display_map
from dea_tools.datahandling import mostcommon_crs

# EASI tools
import git
repo = git.Repo('.', search_parent_directories=True).working_tree_dir  # This gets the current repo directory. Alternatively replace with the easi-notebooks repo path in your home directory
if repo not in sys.path: sys.path.append(repo)
from easi_tools import EasiDefaults, xarray_object_size
from easi_tools.notebook_utils import mostcommon_crs, initialize_dask, localcluster_dashboard, heading

# Data tools
import hvplot.xarray
import cartopy.crs as ccrs
import numpy as np
from dask.diagnostics import ProgressBar
from scipy.ndimage import uniform_filter, variance
from skimage.filters import threshold_minimum

# Dask
from dask.distributed import Client
from dask_gateway import Gateway

EASI environment¶

In [2]:
easi = EasiDefaults('asia')

family = 'sentinel-1'
# product = this.product(family)
product = 'sentinel1_grd_gamma0_20m'
display(Markdown(f'Default {family} product for "{easi.name}": [{product}]({easi.explorer}/products/{product})'))
Successfully found configuration for deployment "asia"

Default sentinel-1 product for "asia": sentinel1_grd_gamma0_20m

Dask and ODC¶

In [3]:
# Start dask cluster - this may take a few minutes
# cluster, client = initialize_dask(workers=2)
# display(client)
# dashboard_address = localcluster_dashboard(client=client, server=easi.hub)
# display(dashboard_address)

cluster, client = initialize_dask(use_gateway=True, workers=5)
display(client)

# ODC
dc = datacube.Datacube()
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client);

# List measurements
dc.list_measurements().loc[[product]]
Starting new cluster.

(Dask client summary: GatewayCluster easihub.c378443398044921a86daac31164ffe1, dashboard at https://hub.asia.easi-eo.solutions/services/dask-gateway/clusters/easihub.c378443398044921a86daac31164ffe1/status)
Out[3]:
| product | measurement | name | dtype | units | nodata | aliases |
|---|---|---|---|---|---|---|
| sentinel1_grd_gamma0_20m | vh | vh | float32 | intensity | NaN | gamma0_vh |
| | vv | vv | float32 | intensity | NaN | gamma0_vv |
| | angle | angle | float32 | degrees | NaN | local_incidence_angle, localincidenceangle |

(The flags_definition, add_offset and scale_factor columns are all NaN for this product.)

Choose an area of interest¶

In [4]:
# Set your own latitude / longitude

# PNG
# latitude = (-4.26, -3.75)
# longitude = (144.03, 144.74)
# time = ('2020-01-01', '2020-05-31')

# Bangladesh
latitude = (21.5, 23.5)
longitude = (89, 90.5)
time = ('2024-05-01', '2024-06-10')

# Vietnam
# epsg:32648
# latitude = (9.1, 9.9)
# longitude = (105.6, 106.4)
# time = ('2024-01-01', '2024-09-10')

display_map(longitude, latitude)
Out[4]:
(Interactive map of the area of interest.)

Load data¶

In [5]:
data = dc.load(
    product = product, 
    latitude = latitude,
    longitude = longitude,
    time = time,
    dask_chunks = {'latitude':2048, 'longitude':2048},      # Dask chunk size
    group_by = 'solar_day',                    # Group by day method
)

display(xarray_object_size(data))
display(data)
'Dataset size: 10.90 GB'
<xarray.Dataset> Size: 12GB
Dimensions:      (time: 13, latitude: 10000, longitude: 7500)
Coordinates:
  * time         (time) datetime64[ns] 104B 2024-05-02T23:48:28.500000 ... 20...
  * latitude     (latitude) float64 80kB 23.5 23.5 23.5 23.5 ... 21.5 21.5 21.5
  * longitude    (longitude) float64 60kB 89.0 89.0 89.0 89.0 ... 90.5 90.5 90.5
    spatial_ref  int32 4B 4326
Data variables:
    vh           (time, latitude, longitude) float32 4GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
    vv           (time, latitude, longitude) float32 4GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
    angle        (time, latitude, longitude) float32 4GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Prepare the data¶

In [7]:
# intensity = amplitude * amplitude
# amplitude = sqrt(intensity)
# dB = 10*log10(intensity)
# intensity = 10**(dB/10)

import math
def intensity_to_db(x):
    return 10*math.log10(x)
def db_to_intensity(db):
    return math.pow(10, db/10.0)

# intensity_to_db(0.5) # = -3
# intensity_to_db(0.001) # = -30
# db_to_intensity(-30) # = 0.001
# db_to_intensity(-3) # = 0.5

db_to_intensity(-19) # ≈ 0.0126

# https://docs.xarray.dev/en/stable/user-guide/dask.html#automatic-parallelization-with-apply-ufunc-and-map-blocks
# https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html#xarray.apply_ufunc
# https://docs.xarray.dev/en/stable/examples/apply_ufunc_vectorize_1d.html

# Apply numpy.log10 to the DataArray
# log10_data = xr.apply_ufunc(np.log10, data)
Out[7]:
0.012589254117941675
In [7]:
# Optional filters

# Select time layers with at least 5% of valid pixels
selected = data.vv.count(dim=['latitude','longitude']).values / (data.sizes['latitude']*data.sizes['longitude']) >= 0.05

data = data.sel(time=selected).persist()
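The selection above can be sketched on a small synthetic stack (numpy only, as an illustration; `count` on an xarray DataArray likewise counts non-NaN pixels per layer):

```python
import numpy as np

# Synthetic (time, latitude, longitude) stack with NaN as nodata
stack = np.full((3, 4, 4), np.nan, dtype=np.float32)
stack[0] = 1.0        # fully valid layer
stack[1, 0, 0] = 1.0  # 1/16 ≈ 6% valid
                      # layer 2 stays entirely NaN

# Fraction of valid (non-NaN) pixels per time layer, then threshold at 5%
valid_frac = np.isfinite(stack).sum(axis=(1, 2)) / (stack.shape[1] * stack.shape[2])
keep = valid_frac >= 0.05
keep  # array([ True,  True, False])
```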

Plot the data¶

In [11]:
def make_image(ds, frame_height=300, **kwargs):
    defaults = dict(
        cmap="Greys_r",
        x = 'longitude', y = 'latitude',
        rasterize=True,
        geo=True,
        frame_height=frame_height,
    )
    defaults.update(**kwargs)
    return ds.hvplot.image(**defaults)
In [13]:
# A single time layer for VV and VH, with linked axes

vvplot = make_image(data.vv.isel(time=1), clim=(0, 0.5), title=f'VV ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[1]})', clabel='Intensity')
vhplot = make_image(data.vh.isel(time=1), clim=(0, 0.1), title=f'VH ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[1]})', clabel='Intensity')
vvplot + vhplot
Out[13]:
In [ ]:
# Make a dB plot

# vvplot = make_image(intensity_to_db(data.vv.isel(time=0)), clim=(-30, -3), title=f'VV ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[0]})', clabel='DB')
# vhplot = make_image(intensity_to_db(data.vh.isel(time=0)), clim=(-30, -1), title=f'VH ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[0]})', clabel='DB')
# vvplot + vhplot
In [14]:
# Subplots for each time layer for VV, with linked axes

make_image(data.vv, clim=(0,0.5)).layout().cols(4)
Out[14]:

Make an RGB image¶

For an RGB visualization we use the ratio between VH and VV.

In [17]:
# Add the vh/vv band
data['vh_vv'] = data.vh / data.vv

# Optionally scale the measurements by their median so they have a similar range for visualization
# med = data / data.median(dim=['latitude','longitude'])
med = data

# Create an RGB array, and persist it on the dask cluster
rgb_ds = xr.concat([med.vv, med.vh, med.vh_vv], 'channel').rename('rgb').to_dataset().persist()
In [18]:
# Plot the RGB
rgb_plot = rgb_ds.hvplot.rgb(
    bands='channel',
    groupby='time', rasterize=True,
    geo=True, # crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),
    title='RGB', frame_height=500,
)

rgb_plot  # + vv_plot + vh_plot
WARNING:param.GeoRGBPlot31140: Clipping input data to the valid range for RGB data ([0..1] for floats or [0..255] for integers).
Out[18]:

Export to Geotiffs¶

Recall that writing a dask-backed dataset to file requires loading it into memory with .compute(). For a large area of interest this can consume substantial memory on your JupyterLab node, which in turn may kill the kernel. If so, skip this step, choose a smaller area, or find a different way to export the data.
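As a rough guide, one float32 band occupies 4 bytes per pixel per time step. A quick back-of-envelope estimate (a sketch; for an actual xarray object, `data.vv.nbytes` gives the exact figure):

```python
import numpy as np

def band_size_gib(ntime, nlat, nlon, dtype=np.float32):
    """Approximate in-memory size (GiB) of a single band once computed."""
    return ntime * nlat * nlon * np.dtype(dtype).itemsize / 2**30

band_size_gib(13, 10000, 7500)  # ≈ 3.63 GiB per band, matching the dataset summary above
```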

In [ ]:
# Make a directory to save outputs to
target = Path.home() / 'output'
if not target.exists(): target.mkdir()

def write_band(ds, varname):
    """Write the variable name of the xarray dataset to a Geotiff files for each time layer"""
    for i in range(len(ds.time)):
        date = ds[varname].isel(time=i).time.dt.strftime('%Y%m%d').data
        single = ds[varname].isel(time=i).compute()
        write_cog(geo_im=single, fname=f'{target}/example_sentinel-1_{varname}_{date}.tif', overwrite=True)
        
write_band(data, 'vv')
write_band(data, 'vh')
# write_band(rgb_ds, 'rgb')